Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima

Yu, Yaodong, Xu, Pan, Gu, Quanquan

Neural Information Processing Systems

We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting third-order smoothness to escape non-degenerate saddle points more efficiently. Our algorithms attain a gradient complexity of $\tilde{O}(\epsilon^{-10/3})$, which improves upon the $\tilde{O}(\epsilon^{-7/2})$ gradient complexity achieved by the state-of-the-art stochastic local-minima-finding algorithms by a factor of $\tilde{O}(\epsilon^{-1/6})$. Experiments on two nonconvex optimization problems demonstrate the effectiveness of our algorithm and corroborate our theory.
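
To make the mechanism concrete, below is a minimal sketch (not the authors' algorithm; the toy objective, the step-size rule, and the constant L3 are illustrative assumptions of our own) of a negative-curvature step that exploits third-order smoothness: trying both directions +/- eta*v and keeping the better point lets the odd-order Taylor terms cancel, so a larger step, and hence a larger per-step decrease, is possible than under Hessian Lipschitz continuity alone.

    import numpy as np

    # Toy nonconvex objective with a saddle point at the origin:
    # f(x) = 0.5 * x^T A x + 0.25 * ||x||^4, where A has one negative eigenvalue.
    A = np.diag([1.0, -0.5])   # lambda_min(A) = -0.5, so the origin is a saddle point
    L3 = 6.0                   # assumed bound on the third-order smoothness constant of f

    def f(x):
        return 0.5 * x @ A @ x + 0.25 * np.sum(x ** 2) ** 2

    def neg_curvature_step(x, v, lam, L3):
        """Move along +/- the unit negative-curvature direction v and keep the
        better of the two trial points. The average of the two trials has no
        first- or third-order Taylor terms, so with an L3-Lipschitz third-order
        derivative the guaranteed decrease is of order lam^2 / L3 rather than
        the lam^3 / L2^2 available from Hessian Lipschitz continuity alone."""
        eta = np.sqrt(6.0 * lam / L3)   # step size suggested by the fourth-order Taylor bound
        x_plus, x_minus = x + eta * v, x - eta * v
        return x_plus if f(x_plus) < f(x_minus) else x_minus

    # Near the saddle, the most negative Hessian eigendirection is e_2.
    x0 = np.array([0.0, 1e-3])
    v = np.array([0.0, 1.0])   # (approximate) unit negative-curvature direction
    lam = 0.5                  # approximate magnitude of the most negative eigenvalue at x0
    x1 = neg_curvature_step(x0, v, lam, L3)
    print(f(x0), f(x1))        # the objective decreases by roughly 3 * lam**2 / (2 * L3)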


Reviews: Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima

Neural Information Processing Systems

This submission is concerned with unconstrained nonconvex stochastic optimization problems, in a setting in which the function to optimize is only available through stochastic estimates. Obtaining points satisfying second-order necessary optimality conditions has been a recent topic of interest at NIPS, as such points can be as good as global minima on several problems arising from machine learning. The authors present new complexity results that improve over the existing complexity bounds for finding an approximate local minimum, identified as a point at which the gradient norm is less than a threshold \epsilon and the minimum Hessian eigenvalue is at least -\sqrt{\epsilon}. By assuming that the objective function possesses a Lipschitz-continuous third-order derivative, the authors are able to guarantee a larger decrease for steps of negative-curvature type: this argument is the key to obtaining lower final complexity bounds and, as a result, a lower dependency on the tolerance \epsilon compared to other techniques (\epsilon^{-10/3} versus \epsilon^{-7/2} in previous works). The authors conduct a thorough review of the related literature, and discuss the main differences between existing algorithms and theirs in Section 2. This literature review appears exhaustive, and identifies key differences between this work and the cited ones.
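
For reference, the decrease guarantee the reviewers allude to can be sketched as follows (our notation and constants, not quoted from the submission): let L_3 be the Lipschitz constant of the third-order derivative and let v be a unit direction with v^\top \nabla^2 f(x) v \le -\lambda. The average of f(x + \eta v) and f(x - \eta v) contains no first- or third-order Taylor terms, and the minimum of the two is at most their average, so

    \min\{f(x + \eta v),\, f(x - \eta v)\} \le f(x) + \frac{\eta^2}{2} v^\top \nabla^2 f(x) v + \frac{L_3 \eta^4}{24} \le f(x) - \frac{\eta^2 \lambda}{2} + \frac{L_3 \eta^4}{24}.

Choosing \eta = \sqrt{6\lambda / L_3} yields a decrease of at least 3\lambda^2 / (2 L_3), which is of order \epsilon when \lambda = \sqrt{\epsilon}; under Hessian Lipschitz continuity alone, the comparable guarantee is only of order \lambda^3 / L_2^2 = \Theta(\epsilon^{3/2}). This larger per-step decrease is what drives the improved \epsilon^{-10/3} dependency.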

